Optimal Stochastic and Online Learning with Individual Iterates

Neural Information Processing Systems

Stochastic composite mirror descent (SCMD) is a simple and efficient method that captures both the geometric and the composite structure of optimization problems in machine learning. Existing strategies require either averaging the iterates or selecting one at random to achieve optimal convergence rates, which can either destroy the sparsity of solutions or slow down practical training. In this paper, we propose a theoretically sound strategy for selecting an individual iterate of vanilla SCMD that achieves optimal rates for both convex and strongly convex problems in a non-smooth learning setting. Outputting an individual iterate preserves the sparsity of solutions, which is crucial for proper interpretation in sparse learning problems. We report experimental comparisons with several baseline methods to show that our method trains quickly and outputs sparse solutions.
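To make the sparsity argument concrete, here is a minimal sketch of the special case of SCMD with a Euclidean mirror map and an l1 regularizer, which reduces to proximal stochastic subgradient descent with soft-thresholding. Everything in it is an assumption for illustration: the hinge loss, the synthetic data, the step size eta0 / sqrt(t), and the names `soft_threshold` and `prox_sgd_hinge_l1`. It simply returns the final iterate next to the running average and does not implement the iterate-selection strategy proposed in the paper.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (coordinate-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def prox_sgd_hinge_l1(X, y, lam=0.1, T=2000, eta0=0.5, seed=0):
    """Proximal stochastic subgradient descent on hinge loss + lam * ||w||_1.

    Returns the final iterate and the running average of all iterates.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    w_avg = np.zeros(d)
    for t in range(1, T + 1):
        i = rng.integers(n)                # sample one example
        eta = eta0 / np.sqrt(t)            # decaying step size (assumed schedule)
        margin = y[i] * (X[i] @ w)
        # Subgradient of the non-smooth hinge loss max(0, 1 - y * <w, x>).
        g = -y[i] * X[i] if margin < 1.0 else np.zeros(d)
        # Gradient step on the loss, then the l1 proximal step.
        w = soft_threshold(w - eta * g, eta * lam)
        w_avg += (w - w_avg) / t           # running mean of w_1, ..., w_t
    return w, w_avg


# Synthetic sparse classification problem (hypothetical data).
rng = np.random.default_rng(0)
n, d = 200, 30
X = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[:5] = 1.0
y = np.sign(X @ w_true)

w_last, w_avg = prox_sgd_hinge_l1(X, y)
print("nonzeros in final iterate:  ", np.count_nonzero(w_last))
print("nonzeros in averaged iterate:", np.count_nonzero(w_avg))
```

A coordinate of the average is zero only if the corresponding coordinates of the iterates sum to exactly zero, so in practice the average is nonzero wherever any iterate was nonzero. A single iterate, by contrast, keeps the exact zeros produced by the soft-thresholding step.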


Reviews: Optimal Stochastic and Online Learning with Individual Iterates

This paper proposes an online stochastic optimization algorithm (similar to SGD) that achieves the optimal convergence rate for the last iterate in two settings (O(1/sqrt(T)) for Lipschitz convex functions and O(1/T) for strongly convex functions), and additionally it allows an arbitrary non-smooth regularizer (e.g. Many subsets of these properties are achieved by prior works. Namely, it was known how to achieve these results up to O(log T) factors. It was also known how to achieve the optimal rates with averaging, which, however, destroys sparsity. This paper presents the first algorithm that has all of these properties simultaneously and removes the log factors. The paper has rigorous proofs of the convergence rates and extensive numerical experiments.
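The rates mentioned in the review can be written out as follows. This is only a restatement in standard notation, not a formula taken from the paper: the symbols f, r, phi, w^*, and \hat{w}_T (the iterate that is output after T steps) are assumptions, and the bounds are read here as holding in expectation.

```latex
% Composite objective: expected loss plus a possibly non-smooth regularizer.
\min_{w} \; \phi(w) = \mathbb{E}_{z}\big[f(w, z)\big] + r(w)

% Optimal rates for the individual output iterate, as stated in the review.
\mathbb{E}\big[\phi(\hat{w}_T)\big] - \phi(w^*) =
\begin{cases}
  O\big(1/\sqrt{T}\big) & \text{Lipschitz convex case,} \\
  O\big(1/T\big)        & \text{strongly convex case.}
\end{cases}

% Previously known bounds for an individual iterate, with the extra log factor.
O\big(\log T/\sqrt{T}\big) \quad \text{and} \quad O\big(\log T / T\big)
```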

